Boltzmann distribution (scipy.stats.boltzmann)#

In SciPy, the Boltzmann distribution is a truncated discrete exponential distribution on the integers

\[ X \in \{0,1,\dots,N-1\}. \]

It shows up whenever probabilities are proportional to an exponentiated negative “energy”: (p(k)\propto e^{-\lambda k}), with a hard cutoff at (N).

What you’ll learn#

  • how the PMF/CDF come from finite geometric series

  • closed-form mean/variance (and how to get higher moments)

  • a NumPy-only inverse-CDF sampler

  • practical usage with scipy.stats.boltzmann (and fitting via scipy.stats.fit)

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from dataclasses import dataclass

from scipy import stats

pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

rng = np.random.default_rng(42)
np.set_printoptions(precision=6, suppress=True)

1) Title & Classification#

Name: boltzmann
Type: Discrete distribution (truncated discrete exponential)

Support#

With SciPy’s parameterization (and default loc=0):

\[ \mathcal{X} = \{0,1,\dots,N-1\}. \]

More generally, with a location shift loc:

\[ \mathcal{X} = \{\mathrm{loc},\, \mathrm{loc}+1,\,\dots,\,\mathrm{loc}+N-1\}. \]
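
As a quick illustration (parameter values are arbitrary), a frozen SciPy distribution can report its support directly, including the loc shift:

# Support of a frozen boltzmann distribution, with and without a loc shift
rv_demo = stats.boltzmann(0.5, 5)             # default loc=0
rv_shifted = stats.boltzmann(0.5, 5, loc=10)

print('support (loc=0) :', rv_demo.support())     # expected (0, 4), i.e. {0, ..., N-1}
print('support (loc=10):', rv_shifted.support())  # expected (10, 14), i.e. {loc, ..., loc+N-1}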

Parameter space#

SciPy uses two shape parameters:

  • (\lambda > 0) (often written (\beta) in physics; acts like an inverse temperature)

  • (N \in {1,2,3,\dots})

So the parameter space is ((0,\infty)\times\mathbb{N}_{>0}) (plus an optional integer loc).
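
SciPy generally signals shape parameters outside this space by returning nan from methods like pmf rather than raising; a small illustration with made-up values:

# Out-of-range shape parameters are flagged with nan
print(stats.boltzmann.pmf(0, 1.5, 10))   # valid: lambda_ > 0, N a positive integer
print(stats.boltzmann.pmf(0, -1.0, 10))  # lambda_ <= 0 -> expected nan
print(stats.boltzmann.pmf(0, 1.5, 0))    # N <= 0 -> expected nan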

2) Intuition & Motivation#

What this distribution models#

The (Gibbs-)Boltzmann idea is:

Lower energy states are exponentially more likely.

If state (k) has “energy” proportional to (k), then

\[ \mathbb{P}(X=k) \propto e^{-\lambda k}. \]

In statistical mechanics, (\lambda) is often (\beta = 1/(k_B T)), where (T) is temperature. Higher temperature (smaller (\lambda)) flattens the distribution; lower temperature (larger (\lambda)) concentrates mass near the minimum-energy state.

SciPy’s boltzmann is the special case where energies are equally spaced and truncated to (k\in{0,\dots,N-1}).

Typical real-world use cases#

  • Thermal equilibrium over discretized energy levels (toy models, truncated state spaces).

  • Softmax / temperature sampling when scores are linear in an integer index.

  • Truncated exponential decay over discrete ranks: higher rank (\Rightarrow) exponentially less likely.

  • Discrete maximum-entropy models under an expected “energy” constraint (finite support).

Relations to other distributions#

  • Geometric distribution: if (r=e^{-\lambda}), then (p(k)\propto r^k). SciPy’s Boltzmann is essentially a geometric distribution truncated to ({0,\dots,N-1}) (checked numerically after this list).

  • Exponential distribution (continuous analog): (f(x)\propto e^{-\lambda x}) on ([0,\infty)).

  • Categorical Gibbs distribution: for arbitrary energies (E_i), (p(i) \propto e^{-\beta E_i}) (a softmax with temperature).
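
To make the geometric relation concrete, the following quick check (arbitrary example parameters) renormalizes the weights (r^k) on ({0,\dots,N-1}) and compares against scipy.stats.boltzmann:

# Truncate-and-renormalize geometric-style weights r^k and compare to boltzmann
lam, N_trunc = 0.7, 8          # arbitrary example parameters
r = np.exp(-lam)

k = np.arange(N_trunc)
p_trunc_geom = r**k / np.sum(r**k)
p_boltzmann = stats.boltzmann.pmf(k, lam, N_trunc)

print('max |truncated geometric - boltzmann|:', np.max(np.abs(p_trunc_geom - p_boltzmann)))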

3) Formal Definition#

PMF#

Let (\lambda>0) and integer (N\ge 1). Define the partition function

\[ Z(\lambda,N) = \sum_{j=0}^{N-1} e^{-\lambda j}. \]

Then the probability mass function is

\[ \mathbb{P}(X=k\mid\lambda,N) = \frac{e^{-\lambda k}}{Z(\lambda,N)} = \frac{(1-e^{-\lambda})\,e^{-\lambda k}}{1-e^{-\lambda N}}, \qquad k\in\{0,1,\dots,N-1\}. \]

Outside the support, the PMF is 0.

CDF#

For real (x), the CDF can be written using (m=\lfloor x \rfloor):

\[\begin{split} F(x) = \mathbb{P}(X\le x) = \begin{cases} 0, & x < 0\\ \dfrac{1-e^{-\lambda(m+1)}}{1-e^{-\lambda N}}, & 0 \le m < N-1\\ 1, & x \ge N-1. \end{cases} \end{split}\]

Location shift (loc)#

SciPy also supports an integer shift loc. If (Y=X+\mathrm{loc}), then

\[ \mathbb{P}(Y=y) = \mathbb{P}(X=y-\mathrm{loc}). \]
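
A one-line numerical check of this shift relation (arbitrary example values; loc is passed to SciPy as a keyword):

# Shift relation: P(Y = y) with loc equals P(X = y - loc) with loc = 0
lam_shift, N_shift, loc_shift = 0.8, 6, 3
y = np.arange(loc_shift, loc_shift + N_shift)
print(np.allclose(
    stats.boltzmann.pmf(y, lam_shift, N_shift, loc=loc_shift),
    stats.boltzmann.pmf(y - loc_shift, lam_shift, N_shift),
))
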
@dataclass(frozen=True)
class Boltzmann:
    '''Boltzmann (truncated discrete exponential) distribution.

    This matches `scipy.stats.boltzmann` with `loc=0`:
    - support: {0, 1, ..., N-1}
    - pmf is proportional to exp(-lambda_ * k)
    '''

    lambda_: float
    N: int


def _validate_params(lambda_: float, N: int) -> tuple[float, int]:
    lambda_ = float(lambda_)
    if not np.isfinite(lambda_) or lambda_ <= 0:
        raise ValueError(f"lambda_ must be positive and finite; got {lambda_!r}.")

    N_int = int(N)
    if N_int != N or N_int <= 0:
        raise ValueError(f"N must be a positive integer; got {N!r}.")

    return lambda_, N_int


def boltzmann_logZ(lambda_: float, N: int) -> float:
    """log Z(lambda_, N) where Z = sum_{k=0}^{N-1} exp(-lambda_ k)."""
    lambda_, N = _validate_params(lambda_, N)
    # Z = (1 - exp(-lambda_ N)) / (1 - exp(-lambda_))
    num = -np.expm1(-lambda_ * N)  # 1 - exp(-lambda_ N)
    den = -np.expm1(-lambda_)  # 1 - exp(-lambda_)
    return float(np.log(num) - np.log(den))


def boltzmann_logpmf(k, lambda_: float, N: int):
    """Log-PMF evaluated at k (supports scalar or array)."""
    lambda_, N = _validate_params(lambda_, N)

    k = np.asarray(k)
    k_int = k.astype(int)
    is_int = k_int == k
    valid = is_int & (k_int >= 0) & (k_int < N)

    out = np.full(k.shape, -np.inf, dtype=float)

    log_norm = np.log(-np.expm1(-lambda_)) - np.log(-np.expm1(-lambda_ * N))
    out[valid] = log_norm - lambda_ * k_int[valid]

    return out


def boltzmann_pmf(k, lambda_: float, N: int):
    """PMF evaluated at k (supports scalar or array)."""
    return np.exp(boltzmann_logpmf(k, lambda_, N))


def boltzmann_cdf(x, lambda_: float, N: int):
    """CDF evaluated at x (supports scalar or array)."""
    lambda_, N = _validate_params(lambda_, N)

    x = np.asarray(x)
    m = np.floor(x).astype(int)

    out = np.zeros(m.shape, dtype=float)
    out[m >= N - 1] = 1.0

    valid = (m >= 0) & (m < N)
    num = -np.expm1(-lambda_ * (m[valid] + 1))
    den = -np.expm1(-lambda_ * N)
    out[valid] = num / den

    return out


def boltzmann_mean_var(lambda_: float, N: int) -> tuple[float, float]:
    """Closed-form mean and variance.

    Uses stable expm1-based expressions and switches to the uniform limit
    when lambda_ is extremely small.
    """
    lambda_, N = _validate_params(lambda_, N)

    if lambda_ < 1e-8:
        mean = 0.5 * (N - 1)
        var = (N**2 - 1) / 12.0
        return float(mean), float(var)

    r = np.exp(-lambda_)
    rN = np.exp(-lambda_ * N)

    one_minus_r = -np.expm1(-lambda_)  # 1 - r
    one_minus_rN = -np.expm1(-lambda_ * N)  # 1 - rN

    mean = r / one_minus_r - N * rN / one_minus_rN
    var = r / (one_minus_r**2) - (N**2) * rN / (one_minus_rN**2)

    return float(mean), float(var)


def boltzmann_entropy(lambda_: float, N: int) -> float:
    """Shannon entropy H(X) = -E[log p(X)]."""
    mean, _ = boltzmann_mean_var(lambda_, N)
    return float(lambda_ * mean + boltzmann_logZ(lambda_, N))


def boltzmann_mgf(t, lambda_: float, N: int):
    """Moment-generating function M(t) = E[exp(t X)].

    For finite N this exists for all real t.
    """
    lambda_, N = _validate_params(lambda_, N)

    t = np.asarray(t)
    dtype = np.complex128 if np.iscomplexobj(t) else float
    t = t.astype(dtype)

    def geom_sum(a):
        # sum_{k=0}^{N-1} exp(-a k) = (1-exp(-a N))/(1-exp(-a)) with a~0 -> N
        num = -np.expm1(-a * N)
        den = -np.expm1(-a)
        out = num / den
        return np.where(np.isclose(a, 0), N, out)

    return geom_sum(lambda_ - t) / geom_sum(lambda_)


def boltzmann_stats(lambda_: float, N: int) -> dict:
    """Return common moments/properties as a dict."""
    ks = np.arange(int(N))
    pmf = boltzmann_pmf(ks, lambda_, N)

    mean, var = boltzmann_mean_var(lambda_, N)
    centered = ks - mean

    mu3 = float(np.sum((centered**3) * pmf))
    mu4 = float(np.sum((centered**4) * pmf))

    skew = mu3 / (var ** 1.5)
    excess_kurt = mu4 / (var**2) - 3.0

    return {
        'mean': mean,
        'var': var,
        'skew': float(skew),
        'excess_kurtosis': float(excess_kurt),
        'entropy': boltzmann_entropy(lambda_, N),
    }
# Quick sanity checks

lambda_, N = 1.4, 19
k = np.arange(N)

pmf_np = boltzmann_pmf(k, lambda_, N)
print('PMF sums to:', pmf_np.sum())

pmf_sp = stats.boltzmann.pmf(k, lambda_, N)
print('Max |numpy - scipy|:', np.max(np.abs(pmf_np - pmf_sp)))
PMF sums to: 1.0
Max |numpy - scipy|: 1.3877787807814457e-17

4) Moments & Properties#

Partition function#

\[ Z(\lambda,N) = \sum_{k=0}^{N-1} e^{-\lambda k} = \frac{1-e^{-\lambda N}}{1-e^{-\lambda}}. \]

Mean and variance (closed form)#

A convenient trick is to use log-partition derivatives. With (Z(\lambda,N)=\sum_k e^{-\lambda k}):

\[ \mathbb{E}[X] = -\frac{\partial}{\partial\lambda}\log Z(\lambda,N), \qquad \mathrm{Var}(X) = \frac{\partial^2}{\partial\lambda^2}\log Z(\lambda,N). \]

Carrying out the derivatives gives

\[ \mathbb{E}[X] = \frac{e^{-\lambda}}{1-e^{-\lambda}} - \frac{N e^{-\lambda N}}{1-e^{-\lambda N}}, \]
\[ \mathrm{Var}(X) = \frac{e^{-\lambda}}{(1-e^{-\lambda})^2} - \frac{N^2 e^{-\lambda N}}{(1-e^{-\lambda N})^2}. \]

Skewness and kurtosis#

Because the support is finite, all moments exist. Higher moments can be computed either by

  • summing (\sum_k k^m\,p(k)) exactly (finite sum), or

  • differentiating (\log Z) further (cumulants).

We report skewness (\gamma_1 = \mu_3/\sigma^3) and excess kurtosis (\gamma_2 = \mu_4/\sigma^4 - 3).

MGF / characteristic function#

Let (M(t)=\mathbb{E}[e^{tX}]). Using the same geometric-series idea:

\[ M(t) = \frac{\sum_{k=0}^{N-1} e^{(t-\lambda)k}}{\sum_{k=0}^{N-1} e^{-\lambda k}} = \frac{Z(\lambda-t, N)}{Z(\lambda, N)}. \]

The characteristic function is (\varphi(\omega)=M(i\omega)).
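
As a quick numerical illustration (example values only), the ratio (Z(\lambda-i\omega,N)/Z(\lambda,N)) can be compared against the direct finite sum defining the characteristic function:

# Characteristic function phi(omega) = M(i*omega): closed form vs direct finite sum
lam, N_cf, omega = 1.4, 19, 0.9   # example values

k = np.arange(N_cf)
cf_sum = np.sum(np.exp(1j * omega * k) * boltzmann_pmf(k, lam, N_cf))

# Z(lam - i*omega, N) / Z(lam, N) via the geometric-series closed form
Z_shift = (1 - np.exp(-(lam - 1j * omega) * N_cf)) / (1 - np.exp(-(lam - 1j * omega)))
Z_plain = (1 - np.exp(-lam * N_cf)) / (1 - np.exp(-lam))
cf_closed = Z_shift / Z_plain

print('CF sum vs closed:', cf_sum, cf_closed)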

Entropy#

Using (\log p(X)=-\lambda X - \log Z),

\[ H(X) = -\mathbb{E}[\log p(X)] = \lambda\,\mathbb{E}[X] + \log Z(\lambda,N). \]
# Moments: NumPy formulas vs SciPy

lambda_, N = 1.4, 19

mom_np = boltzmann_stats(lambda_, N)
mean_sp, var_sp, skew_sp, kurt_sp = stats.boltzmann.stats(lambda_, N, moments='mvsk')
ent_sp = stats.boltzmann.entropy(lambda_, N)

print('NumPy moments:', mom_np)
print('SciPy mean/var/skew/kurt:', float(mean_sp), float(var_sp), float(skew_sp), float(kurt_sp))
print('SciPy entropy:', float(ent_sp))

# MGF spot-check: compare closed form M(t) to finite sum
k = np.arange(N)
pmf = boltzmann_pmf(k, lambda_, N)

t = 0.3
mgf_sum = float(np.sum(np.exp(t * k) * pmf))
mgf_closed = float(np.real(boltzmann_mgf(t, lambda_, N)))
print('MGF sum vs closed:', mgf_sum, mgf_closed)
NumPy moments: {'mean': 0.3273108178480401, 'var': 0.4344431884043245, 'skew': 2.510337952872525, 'excess_kurtosis': 8.30179503342743, 'entropy': 0.7413900989073794}
SciPy mean/var/skew/kurt: 0.3273108178480401 0.4344431884043245 2.5103379528725243 8.301795033427425
SciPy entropy: 0.7413900989073795
MGF sum vs closed: 1.1293215104592877 1.1293215104592875

5) Parameter Interpretation#

Meaning of the parameters#

  • (\lambda) controls how fast probability decays with increasing (k).

    • larger (\lambda) (\Rightarrow) much more mass near 0 (low “energy”)

    • smaller (\lambda) (\Rightarrow) flatter distribution; as (\lambda\to 0), it approaches a discrete uniform on ({0,\dots,N-1}) (see the numerical check after this list)

  • (N) sets the maximum state (a hard cutoff). Increasing (N) extends the right tail and increases the mean/variance.
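
Two quick numerical checks of these claims, using the helpers defined above with arbitrary example parameters:

# (i) lambda_ -> 0: the PMF approaches the discrete uniform 1/N
N_demo = 10
for lam in [1.0, 0.1, 1e-3, 1e-6]:
    pmf = boltzmann_pmf(np.arange(N_demo), lam, N_demo)
    print(f'lambda_={lam:g}: max |pmf - 1/N| = {np.max(np.abs(pmf - 1.0 / N_demo)):.3e}')

# (ii) increasing N at fixed lambda_ extends the tail: mean and variance grow
for n in [5, 10, 20, 40]:
    m, v = boltzmann_mean_var(0.2, n)
    print(f'N={n:2d}: mean={m:.4f}, var={v:.4f}')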

Shape changes#

Below, we hold (N) fixed and vary (\lambda).

# PMF shapes for different lambda_

N = 25
lambdas = [0.1, 0.3, 0.8, 1.5]
ks = np.arange(N)

rows = []
for lam in lambdas:
    pmf = boltzmann_pmf(ks, lam, N)
    rows.append(
        {
            'k': ks,
            'pmf': pmf,
            'lambda_': np.full_like(ks, lam, dtype=float),
        }
    )

df = {
    'k': np.concatenate([r['k'] for r in rows]),
    'pmf': np.concatenate([r['pmf'] for r in rows]),
    'lambda_': np.concatenate([r['lambda_'] for r in rows]),
}

fig = px.line(
    df,
    x='k',
    y='pmf',
    color='lambda_',
    markers=True,
    title=f"Boltzmann PMF for N={N} (varying lambda_)",
)
fig.update_layout(xaxis_title='k', yaxis_title='P(X = k)')
fig.show()

6) Derivations#

Expectation via the log-partition function#

Start with

\[ Z(\lambda,N) = \sum_{k=0}^{N-1} e^{-\lambda k}. \]

Differentiate:

\[ \frac{\partial}{\partial\lambda} Z(\lambda,N) = \sum_{k=0}^{N-1} (-k) e^{-\lambda k}. \]

Divide by (Z) to get a derivative of (\log Z):

\[ \frac{\partial}{\partial\lambda} \log Z = \frac{1}{Z}\,\frac{\partial Z}{\partial\lambda} = -\sum_{k=0}^{N-1} k\,\frac{e^{-\lambda k}}{Z} = -\mathbb{E}[X]. \]

So (\mathbb{E}[X] = -\partial_\lambda \log Z). A second derivative yields

\[ \mathrm{Var}(X) = \partial_{\lambda}^2 \log Z. \]
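
A small finite-difference sanity check of these identities (step size and parameter values are arbitrary illustration choices), using boltzmann_logZ and boltzmann_mean_var from above:

# Central differences of log Z vs the closed-form mean and variance
lam, N_check, h = 1.4, 19, 1e-4

lz_plus = boltzmann_logZ(lam + h, N_check)
lz_mid = boltzmann_logZ(lam, N_check)
lz_minus = boltzmann_logZ(lam - h, N_check)

mean_fd = -(lz_plus - lz_minus) / (2 * h)           # -d/dlambda log Z
var_fd = (lz_plus - 2 * lz_mid + lz_minus) / h**2   # d^2/dlambda^2 log Z

mean_cf, var_cf = boltzmann_mean_var(lam, N_check)
print('mean: closed form', mean_cf, 'vs finite difference', mean_fd)
print('var : closed form', var_cf, 'vs finite difference', var_fd)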

Likelihood (iid sample)#

Given observations (x_1,\dots,x_n) with known (N), the log-likelihood is

\[ \ell(\lambda) = \sum_{i=1}^n \log p(x_i\mid\lambda,N) = -\lambda \sum_i x_i - n\log Z(\lambda,N). \]

The score equation is

\[ \frac{\partial\ell}{\partial\lambda} = -\sum_i x_i - n\frac{\partial}{\partial\lambda}\log Z(\lambda,N) = -n\bar{x} + n\,\mathbb{E}_{\lambda}[X]. \]

Setting this to 0 yields a classic exponential-family moment match:

\[ \mathbb{E}_{\hat\lambda}[X] = \bar{x}. \]

Because (\mathbb{E}_{\lambda}[X]) decreases monotonically in (\lambda) for (\lambda>0), this equation has a unique solution whenever (\bar{x}\in(0,(N-1)/2)). If (\bar{x}\ge (N-1)/2), the best fit under the (\lambda>0) constraint occurs at the boundary (\lambda\to 0) (almost-uniform); if (\bar{x}=0) (all observations equal 0), the likelihood is maximized only in the limit (\lambda\to\infty).

def boltzmann_loglik(lambda_: float, data: np.ndarray, N: int) -> float:
    lambda_, N = _validate_params(lambda_, N)
    x = np.asarray(data, dtype=float)
    if np.any((x < 0) | (x >= N) | (x != np.floor(x))):
        raise ValueError('All observations must be integers in {0,...,N-1}.')

    x_sum = float(x.sum())

    # log p(x) = log(1-exp(-lambda_)) - lambda_ x - log(1-exp(-lambda_ N))
    log_norm = np.log(-np.expm1(-lambda_)) - np.log(-np.expm1(-lambda_ * N))
    return float(len(x) * log_norm - lambda_ * x_sum)


def boltzmann_mle_lambda(data: np.ndarray, N: int, *, tol: float = 1e-10, max_iter: int = 200) -> float:
    """MLE for lambda_ with known N under lambda_ > 0.

    Solves E_lambda[X] = x̄ with a monotone bisection search.
    """
    N = int(N)
    x = np.asarray(data)
    xbar = float(np.mean(x))

    # boundary cases
    mean_uniform = 0.5 * (N - 1)
    if xbar >= mean_uniform:
        return 0.0  # corresponds to the lambda_ -> 0 limit (uniform)
    if xbar <= 0:
        return float('inf')

    # Find an upper bracket where mean <= xbar
    lo = 1e-12
    hi = 1.0
    while boltzmann_mean_var(hi, N)[0] > xbar:
        hi *= 2.0
        if hi > 1e6:
            break

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        m_mid = boltzmann_mean_var(mid, N)[0]
        if m_mid > xbar:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * max(1.0, hi):
            break

    return hi


# Demo: recover lambda_ from simulated data
true_lambda, N = 1.2, 15
x = stats.boltzmann.rvs(true_lambda, N, size=4000, random_state=rng)

lambda_hat = boltzmann_mle_lambda(x, N)

print('true lambda_ :', true_lambda)
print('mle  lambda_ :', lambda_hat)
print('sample mean  :', float(np.mean(x)))
print('theory mean@true:', boltzmann_mean_var(true_lambda, N)[0])
print('theory mean@hat :', boltzmann_mean_var(lambda_hat, N)[0])

print('loglik(true):', boltzmann_loglik(true_lambda, x, N))
print('loglik(hat) :', boltzmann_loglik(lambda_hat, x, N))
true lambda_ : 1.2
mle  lambda_ : 1.188392420649003
sample mean  : 0.43825
theory mean@true: 0.4310125322436335
theory mean@hat : 0.4382499999510091
loglik(true): -3537.129610521816
loglik(hat) : -3536.960983993395

7) Sampling & Simulation#

NumPy-only inverse CDF sampling#

Because the support is finite, a simple and robust approach is inverse transform sampling:

  1. Compute (p(k)) for (k=0,\dots,N-1)

  2. Form the CDF: (F(k)=\sum_{j\le k} p(j))

  3. Draw (U\sim\mathrm{Unif}(0,1))

  4. Return the smallest (k) such that (F(k)\ge U)

This is (\mathcal{O}(N)) preprocessing and then (\mathcal{O}(\log N)) per sample via binary search (np.searchsorted).

def logsumexp_np(a: np.ndarray) -> float:
    '''NumPy-only log-sum-exp for 1D arrays.'''
    a = np.asarray(a, dtype=float)
    a_max = float(np.max(a))
    return float(a_max + np.log(np.sum(np.exp(a - a_max))))


def boltzmann_rvs_numpy(lambda_: float, N: int, size: int, *, rng: np.random.Generator) -> np.ndarray:
    lambda_, N = _validate_params(lambda_, N)

    ks = np.arange(N)
    logw = -lambda_ * ks
    logw -= logsumexp_np(logw)
    p = np.exp(logw)

    cdf = np.cumsum(p)
    cdf[-1] = 1.0  # protect against roundoff

    u = rng.random(size)
    return np.searchsorted(cdf, u, side='right')


# Monte Carlo check
lambda_, N = 0.7, 30
samples = boltzmann_rvs_numpy(lambda_, N, size=200_000, rng=rng)

mean_theory, var_theory = boltzmann_mean_var(lambda_, N)
mean_mc = float(np.mean(samples))
var_mc = float(np.var(samples))

print('theory mean/var:', mean_theory, var_theory)
print('MC     mean/var:', mean_mc, var_mc)
theory mean/var: 0.9864338408867821 1.959484948528839
MC     mean/var: 0.98722 1.9630466716000001

8) Visualization#

We’ll visualize:

  • the PMF (bars)

  • the CDF (step-like curve)

  • Monte Carlo samples compared to the theoretical PMF

lambda_, N = 0.6, 25
ks = np.arange(N)

pmf = boltzmann_pmf(ks, lambda_, N)
cdf = boltzmann_cdf(ks, lambda_, N)

fig_pmf = go.Figure(
    data=[go.Bar(x=ks, y=pmf, name='PMF')],
    layout=go.Layout(
        title=f"Boltzmann PMF (lambda_={lambda_}, N={N})",
        xaxis_title='k',
        yaxis_title='P(X = k)',
    ),
)
fig_pmf.show()

fig_cdf = go.Figure(
    data=[go.Scatter(x=ks, y=cdf, mode='lines+markers', name='CDF')],
    layout=go.Layout(
        title=f"Boltzmann CDF (lambda_={lambda_}, N={N})",
        xaxis_title='k',
        yaxis_title='P(X ≤ k)',
        yaxis=dict(range=[0, 1.05]),
    ),
)
fig_cdf.show()

# Monte Carlo samples
samples = boltzmann_rvs_numpy(lambda_, N, size=50_000, rng=rng)
counts = np.bincount(samples, minlength=N)
emp_p = counts / counts.sum()

fig_mc = go.Figure()
fig_mc.add_trace(go.Bar(x=ks, y=emp_p, name='Empirical (MC)', opacity=0.7))
fig_mc.add_trace(go.Scatter(x=ks, y=pmf, name='Theoretical PMF', mode='lines+markers'))
fig_mc.update_layout(
    title=f"Monte Carlo vs theory (n={len(samples):,})",
    xaxis_title='k',
    yaxis_title='Probability',
)
fig_mc.show()

9) SciPy Integration#

SciPy provides scipy.stats.boltzmann as an rv_discrete distribution.

Common methods:

  • pmf, logpmf

  • cdf, sf

  • rvs

  • stats (mean/var/skew/kurtosis)

  • entropy

Fitting#

Unlike many continuous distributions, rv_discrete distributions don’t expose a .fit(...) method. In modern SciPy, you can fit discrete or continuous distributions with scipy.stats.fit.

# Basic SciPy usage
lambda_, N = 1.0, 20
rv = stats.boltzmann(lambda_, N)  # frozen distribution

ks = np.arange(N)
print('pmf[:5]:', rv.pmf(ks[:5]))
print('cdf[:5]:', rv.cdf(ks[:5]))

s = rv.rvs(size=10, random_state=rng)
print('rvs:', s)

mean_sp, var_sp = rv.stats(moments='mv')
print('mean/var:', float(mean_sp), float(var_sp))

# Fit using scipy.stats.fit (fix N and loc, estimate lambda_)
true_lambda, true_N = 1.2, 15
x = stats.boltzmann.rvs(true_lambda, true_N, size=2500, random_state=rng)

fit_res = stats.fit(
    stats.boltzmann,
    x,
    bounds={
        'lambda_': (1e-6, 10.0),
        'N': (true_N, true_N),  # keep fixed (common in practice)
        'loc': (0, 0),
    },
)

print(fit_res)
pmf[:5]: [0.632121 0.232544 0.085548 0.031471 0.011578]
cdf[:5]: [0.632121 0.864665 0.950213 0.981684 0.993262]
rvs: [1 1 3 0 0 2 2 0 2 1]
mean/var: 0.5819766656462539 0.92067276974634
  params: FitParams(lambda_=1.2148140377734147, N=15.0, loc=0.0)
 success: True
 message: 'Optimization terminated successfully.'

10) Statistical Use Cases#

(A) Hypothesis testing: likelihood ratio test (LRT)#

Suppose (N) is known and you want to test

  • (H_0: \lambda = \lambda_0)

  • (H_1: \lambda) free

The LRT statistic is

\[ \Lambda = 2\big(\ell(\hat\lambda) - \ell(\lambda_0)\big) \overset{\cdot}{\sim} \chi^2_1 \]

for large samples (Wilks’ theorem).

(B) Bayesian modeling: grid posterior over (\lambda)#

Because (\lambda) is 1D, you can do simple grid Bayes:

\[ \log p(\lambda\mid x) = \log p(\lambda) + \ell(\lambda) + C. \]

(C) Generative modeling: Gibbs / softmax over energies#

More generally, for a finite set of energies (E_i):

\[ \mathbb{P}(i) = \frac{e^{-\beta E_i}}{\sum_j e^{-\beta E_j}}. \]

This is exactly a softmax with temperature (T=1/\beta).

# (A) Likelihood ratio test example

N = 20
lambda0 = 0.8

# Simulate under an alternative
true_lambda = 1.1
x = stats.boltzmann.rvs(true_lambda, N, size=4000, random_state=rng)

lambda_hat = boltzmann_mle_lambda(x, N)

lrt = 2.0 * (boltzmann_loglik(lambda_hat, x, N) - boltzmann_loglik(lambda0, x, N))
p_value = stats.chi2.sf(lrt, df=1)

print('lambda0:', lambda0)
print('lambda_hat:', lambda_hat)
print('LRT statistic:', lrt)
print('p-value:', float(p_value))
lambda0: 0.8
lambda_hat: 1.0982790587590858
LRT statistic: 334.1736246842629
p-value: 1.1851773758214812e-74
# (B) Bayesian grid posterior for lambda_

N = 20
x = stats.boltzmann.rvs(1.0, N, size=200, random_state=rng)

lam_grid = np.linspace(1e-3, 4.0, 600)

# Example prior: Gamma(shape=2, scale=1) on lambda_ (mean=2)
prior = stats.gamma(a=2.0, scale=1.0)
log_prior = prior.logpdf(lam_grid)

log_like = np.array([boltzmann_loglik(lam, x, N) for lam in lam_grid])
log_post_unnorm = log_prior + log_like
log_post_unnorm -= np.max(log_post_unnorm)
post = np.exp(log_post_unnorm)
post /= np.trapz(post, lam_grid)

# Posterior summaries
post_mean = float(np.trapz(lam_grid * post, lam_grid))
cdf = np.cumsum(post)
cdf /= cdf[-1]
ci_low = float(np.interp(0.025, cdf, lam_grid))
ci_high = float(np.interp(0.975, cdf, lam_grid))

print('posterior mean:', post_mean)
print('95% credible interval:', (ci_low, ci_high))

fig = go.Figure()
fig.add_trace(go.Scatter(x=lam_grid, y=post, mode='lines', name='posterior'))
fig.add_vline(x=post_mean, line_dash='dash', line_color='black', annotation_text='posterior mean')
fig.update_layout(title='Posterior over lambda_ (grid approximation)', xaxis_title='lambda_', yaxis_title='density')
fig.show()
posterior mean: 0.9563436545547456
95% credible interval: (0.821300695264435, 1.0950861245578618)
# (C) Gibbs / softmax sampling over arbitrary energies

energies = np.array([0.0, 0.5, 1.0, 2.0, 3.0])
labels = [f'state {i}' for i in range(len(energies))]

betas = [0.5, 1.0, 2.0]  # inverse temperatures

fig = go.Figure()
for beta in betas:
    w = np.exp(-beta * energies)
    p = w / w.sum()
    fig.add_trace(go.Scatter(x=labels, y=p, mode='lines+markers', name=f'beta={beta}'))

fig.update_layout(title='Gibbs probabilities vs inverse temperature', xaxis_title='state', yaxis_title='probability')
fig.show()

# Sample from one setting
beta = 1.5
p = np.exp(-beta * energies)
p = p / p.sum()

draws = rng.choice(len(energies), size=20, p=p)
print('draws:', draws)
draws: [1 0 2 0 2 0 0 1 0 0 0 0 3 0 0 0 2 0 0 1]

11) Pitfalls#

  • Parameter constraints (SciPy): lambda_ must be (>0); N must be a positive integer.

  • Support mismatch: data must be integers in ({0,\dots,N-1}) (or shifted by loc).

  • Fitting N: if N is unknown, estimating it from finite data can be unstable; using max(data)+1 can underestimate the true cutoff (see the short demo after this list).

  • Small (\lambda) numerical cancellation: expressions like (1-e^{-\lambda}) lose precision when (\lambda) is tiny; use expm1 (-np.expm1(-x)) or switch to the uniform limit.

  • CDF rounding: ensure the last CDF value is exactly 1.0 before searchsorted sampling.
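
As a quick illustration of the N-fitting point (simulation settings chosen arbitrarily): with a moderately large (\lambda), the top states are almost never observed, so max(data)+1 tends to fall short of the true cutoff.

# Fitting-N pitfall demo: max(data)+1 often misses the true cutoff
lam_true, N_true = 1.5, 12
x_demo = stats.boltzmann.rvs(lam_true, N_true, size=500, random_state=rng)

print('true N           :', N_true)
print('max(data) + 1    :', int(x_demo.max()) + 1)
print('P(X = N-1) truth :', float(stats.boltzmann.pmf(N_true - 1, lam_true, N_true)))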

# Tiny-lambda numerical pitfall demo

lam = 1e-12
naive = 1 - np.exp(-lam)
stable = -np.expm1(-lam)
print('1 - exp(-lam) naive :', naive)
print('1 - exp(-lam) stable:', stable)
1 - exp(-lam) naive : 9.999778782798785e-13
1 - exp(-lam) stable: 9.999999999995e-13

12) Summary#

  • boltzmann is a discrete, finite-support distribution with (p(k)\propto e^{-\lambda k}) on ({0,\dots,N-1}).

  • The PMF/CDF come directly from finite geometric series.

  • Mean/variance have clean closed forms via derivatives of (\log Z).

  • Sampling is straightforward with inverse CDF on the finite support.

  • SciPy provides pmf/cdf/rvs/stats/entropy; for fitting, use scipy.stats.fit or solve the MLE moment equation.